A consumer's perception of products and services is known to be influenced by the quality of the customer care operations offered for said products or services. Call centres providing said customer care services typically record the agent-customer voice transactions for various reasons. One such reason is to extract data from the audio conversations that can be utilized to improve customer experience and/or enhance business opportunities of an enterprise. Typically, the naturally spoken audio conversation between an agent and a customer is transcribed to text, and the text is then used to derive analytics for further use. The process of converting naturally spoken audio conversation to text is prone to errors in spite of the strides made in the area of automatic speech recognition (ASR).